Install Python libraries for specific needs

Note: To install these libraries in Visual Studio Code (VS Code), you don't actually install them "into" VS Code itself; rather, you install them into a Python Environment that VS Code then uses to run your code.
The professional standard is to use a Virtual Environment. This keeps your projects organized and prevents library versions from clashing (e.g., your Healthcare project won't break your Finance project).
Note: If your virtual environment is already active, skip to Step 4.
Step 1: Open the Terminal in VS Code
- Open your project folder in VS Code.
- Open the integrated terminal by pressing Ctrl + ` (backtick) or by going to Terminal > New Terminal in the top menu.
Step 2: Create a Virtual Environment
In the terminal, type the following command:
Windows:
python -m venv .venv
macOS/Linux:
python3 -m venv .venv
This creates a folder named .venv in your project directory, which will hold your libraries.
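If you would rather script this step, the standard library's venv module (the module that python -m venv runs) can also be called from Python. The sketch below is only an illustration: create_env is a made-up helper name, and with_pip=True is passed explicitly because venv.create leaves pip out by default, unlike the command-line form.

```python
import venv
from pathlib import Path


def create_env(project_dir, name=".venv", with_pip=True):
    """Create a virtual environment inside project_dir and return its path.

    create_env is a hypothetical helper for this example; venv.create is the
    standard-library call behind `python -m venv`.
    """
    env_path = Path(project_dir) / name
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_env(".") from the project folder should produce the same .venv folder as the terminal command; the pyvenv.cfg file inside it is what marks the folder as a virtual environment.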
Step 3: Select the Environment in VS Code
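- Press Ctrl + Shift + P (Cmd + Shift + P on macOS) to open the Command Palette.
- Type "Python: Select Interpreter" and click it.
- Choose the one that says "Python 3.x.x ('.venv': venv)". This tells VS Code to use the local environment you just created.
- Open a new terminal so that VS Code can activate the environment for it. If the prompt does not start with (.venv), activate it manually: on Windows, run .venv\Scripts\activate; on macOS/Linux, run source .venv/bin/activate.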
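To confirm that your code really runs inside the environment, a short script like the following can help. It is a minimal sketch that assumes the environment was created with venv as in Step 2; in_virtual_env is an illustrative name, not part of VS Code or pip.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Environment:", sys.prefix)
print("Virtual environment active:", in_virtual_env())
```

If the interpreter path does not contain .venv, VS Code is probably still using a global Python installation.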
Step 4: Install the Libraries
Now, use pip (Python's package manager) to install the libraries you need. You can install them one by one or all at once.
Install only the groups that match your needs:
# Data Analysis
pip install pandas numpy matplotlib seaborn
# AI & Machine Learning
pip install scikit-learn torch torchvision torchaudio
# Finance
pip install yfinance quantlib pandas-ta
# Healthcare
pip install pydicom biopython lifelines
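One thing that often trips people up after installing: the name you give to pip is not always the name you import. The sketch below checks a few of the libraries above by their import names. The IMPORT_NAMES table and the check_imports helper are illustrative; extend the table for the libraries you actually use.

```python
from importlib.util import find_spec

# pip package name -> name used in `import` statements.
IMPORT_NAMES = {
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "yfinance": "yfinance",
    "pydicom": "pydicom",
    "biopython": "Bio",
}


def check_imports(import_names):
    """Map each pip name to True if its import name can be found, else False."""
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in import_names.items()
    }


for pip_name, ok in check_imports(IMPORT_NAMES).items():
    print(f"{pip_name:15} {'installed' if ok else 'missing'}")
```

Run it with the interpreter from your virtual environment; anything reported as missing was either not installed or was installed into a different environment.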
Pro Tip: Using a requirements.txt file
If you have many libraries, it is best to list them in a text file named requirements.txt.
1. Create a new file in VS Code named requirements.txt.
2. List your libraries inside, one per line:
pandas
scikit-learn
yfinance
pydicom
3. Run this single command in your terminal:
pip install -r requirements.txt
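After the install finishes, you may want to check that everything in the list actually landed in the current environment. The following is a rough sketch using only the standard library. parse_requirements and missing_packages are hypothetical helpers, and the parsing is deliberately simplified: it handles plain names, version specifiers, and # comments, but not URLs or other advanced requirement syntax.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_requirements(text):
    """Return package names from requirements-style text, skipping comments."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        # Keep only the name: cut at the first version, extras, or marker character.
        for sep in "=<>!~[; ":
            line = line.split(sep, 1)[0]
        names.append(line)
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing


example = """
pandas
scikit-learn  # machine learning
yfinance
pydicom
"""
print("Missing:", missing_packages(parse_requirements(example)))
```

To check a real file, pass the file's contents instead of the example string, for instance Path("requirements.txt").read_text() with pathlib.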
Troubleshooting Tips
- "pip is not recognized": On Windows, try using python -m pip install ... instead of just pip install.
- Permissions: If you get a permission error, ensure you have activated your virtual environment. On Windows, run .venv\Scripts\activate; on macOS/Linux, run source .venv/bin/activate.
- VS Code Extension: Make sure you have the Python extension by Microsoft installed from the VS Code Marketplace.
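The python -m pip form works because it runs pip through a specific interpreter, so packages land in that interpreter's environment. The same idea can be used from a script, sketched below with sys.executable standing in for "the Python that is running right now". pip_install_command and pip_install are illustrative names, not part of pip.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command tied to the interpreter that runs this script."""
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(packages):
    """Run the install and raise CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(packages), check=True)


# Only print the command here; call pip_install([...]) to actually run it.
print(" ".join(pip_install_command(["pandas", "numpy"])))
```

Printing the command first is a cheap way to see which interpreter (and therefore which environment) an install would target.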
Recommendations:
Creating a requirements.txt file is a simple way to make your project's environment reproducible, especially if you pin versions (for example, by generating the file with pip freeze > requirements.txt). Below is a categorized template covering the widely used libraries described in the Resources section.
You can copy this entire block and save it as a file named requirements.txt in your VS Code project folder.
###############################################################################
# 1. DATA ANALYSIS & VISUALIZATION
###############################################################################
numpy # Numerical computing and arrays
pandas # Data manipulation and DataFrames
matplotlib # Basic 2D plotting
seaborn # Statistical data visualization
plotly # Interactive web-based visualizations
scipy # Scientific computing and advanced math
###############################################################################
# 2. ARTIFICIAL INTELLIGENCE (AI) & MACHINE LEARNING
###############################################################################
scikit-learn # Traditional ML (Regression, Classification)
torch # PyTorch - Deep Learning & Neural Networks
torchvision # Computer Vision utilities for PyTorch
transformers # Hugging Face - NLP and Generative AI
langchain # Framework for LLM applications
opencv-python # Computer Vision and image processing
###############################################################################
# 3. WEB DEVELOPMENT
###############################################################################
django # High-level "Batteries-included" web framework
flask # Lightweight micro-framework
fastapi # High-performance API framework
uvicorn # ASGI server for FastAPI
streamlit # Building data apps quickly
beautifulsoup4 # Web scraping and HTML parsing
###############################################################################
# 4. FINANCIAL ANALYSIS & ALGORITHMIC TRADING
###############################################################################
yfinance # Downloading Yahoo Finance market data
QuantLib # Quantitative finance modeling and risk
pandas-ta # Technical analysis indicators
backtrader # Strategy backtesting framework
pyportfolioopt # Portfolio optimization (Mean-Variance)
###############################################################################
# 5. HEALTHCARE & MEDICAL RESEARCH
###############################################################################
pydicom # Handling DICOM medical imaging files
biopython # Bioinformatics and biological computation
lifelines # Survival analysis for clinical trials
pyhealth # Deep learning for healthcare datasets
How to use this file in VS Code:
- Open your Terminal (Ctrl + `).
- Ensure your Virtual Environment is active (you should see (.venv) at the start of your terminal line).
- Run the install command:
pip install -r requirements.txt
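Tip: A note on AI libraries. If you have a dedicated NVIDIA GPU, you may need a specific build of torch to enable hardware acceleration. Which build pip installs by default depends on your operating system, so use the install selector on pytorch.org to get the exact command for your OS and CUDA version. If you don't have a GPU, PyTorch still works and runs on the CPU.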
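Once torch is installed, you can ask it whether it sees a usable GPU. The helper below is a small sketch (describe_torch_device is a made-up name) that also copes with torch not being installed at all.

```python
def describe_torch_device():
    """Report whether PyTorch is installed and whether it can use a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return f"CUDA GPU available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed but will run on the CPU"


print(describe_torch_device())
```

If you have an NVIDIA GPU and still get the CPU message, the installed build most likely lacks CUDA support, or the GPU driver is missing or out of date.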
Resources:
1. Data Analysis & Visualization
These libraries form the "Scientific Python" stack, used for cleaning, manipulating, and visualizing data.
- Pandas: The industry standard for data manipulation. It provides DataFrames for handling tabular data (like Excel/SQL).
- NumPy: Essential for numerical computing. It supports large, multi-dimensional arrays and high-level mathematical functions.
- Matplotlib: The foundation for 2D plotting. Best for creating static, high-quality charts and graphs.
- Seaborn: Built on Matplotlib, it makes statistical graphics look much more professional with less code.
- SciPy: Used for scientific and technical computing, including integration, optimization, and signal processing.
- Plotly: Ideal for creating interactive and web-ready visualizations.
2. Artificial Intelligence (AI) & Machine Learning
Python's AI ecosystem is divided into "Traditional ML" and "Deep Learning" (Neural Networks).
- Scikit-learn: The most important library for traditional machine learning (regression, classification, and clustering).
- PyTorch: Currently the most popular framework for AI research and deep learning, known for its flexibility and "Pythonic" feel.
- TensorFlow / Keras: Developed by Google, this is a robust ecosystem for building and deploying production-scale deep learning models.
- Hugging Face (Transformers): The go-to library for modern Generative AI and Natural Language Processing (NLP).
- LangChain: A newer framework specifically designed to build applications powered by Large Language Models (like GPT-4 or Llama).
- OpenCV: The standard for Computer Vision tasks like facial recognition and image processing.
3. Web Development
Python offers frameworks for building everything from simple APIs to massive social media platforms.
- Django: A "batteries-included" framework. It comes with a built-in admin panel, database ORM, and security features. Best for large, complex sites.
- Flask: A "micro-framework." It is lightweight and flexible, making it perfect for smaller apps and beginners.
- FastAPI: The modern choice for building high-performance APIs. It is extremely fast and uses Python type hints for automatic validation.
- Streamlit: Specifically for data scientists. It allows you to turn a data script into a shareable web app in minutes without knowing HTML/CSS.
- Beautiful Soup / Scrapy: Essential for Web Scraping (collecting data from websites to use in your AI or analysis projects).
4. Financial Analysis & Algorithmic Trading
Financial data is mostly time-series data, and calculations on it (prices, returns, risk measures) are sensitive to small numerical and timing errors.
Data Retrieval & Market Access
- yfinance: The most popular community-driven library for downloading historical market data from Yahoo Finance.
- Alpha Vantage / Quandl (Nasdaq Data Link): APIs used to pull professional-grade financial, economic, and alternative data.
- CCXT: A "must-have" for cryptocurrency traders; it provides a unified way to connect to over 100 different crypto exchanges.
Quantitative Modeling & Analytics
- QuantLib: An industry-standard library for modeling, trading, and risk management. It is used for complex tasks like pricing derivatives and interest rate modeling.
- PyPortfolioOpt: A specialized library for Portfolio Optimization. It helps you calculate the best way to allocate assets to maximize returns for a given level of risk (Mean-Variance Optimization).
- Arch: Used for financial econometrics, specifically for modeling volatility (GARCH models) and performing unit root tests.
Technical Analysis & Indicators
- TA-Lib: The "gold standard" for technical analysis. It includes over 150 indicators like RSI, MACD, and Bollinger Bands.
- Pandas-ta: A more "Pythonic" and easy-to-use alternative to TA-Lib that integrates directly with Pandas DataFrames.
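To make "indicator" concrete, here is a hand-written sketch of two of the simplest ones, a simple moving average and Bollinger Bands, using only the standard library. This is not the TA-Lib or pandas-ta API; the function names, default parameters, and prices are illustrative, and the bands use the population standard deviation as an assumption.

```python
from statistics import fmean, pstdev


def sma(values, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    return [fmean(values[i - window:i]) for i in range(window, len(values) + 1)]


def bollinger_bands(values, window=20, num_std=2.0):
    """Return (lower, middle, upper) band lists, one entry per full window."""
    lower, middle, upper = [], [], []
    for i in range(window, len(values) + 1):
        chunk = values[i - window:i]
        mid = fmean(chunk)
        spread = num_std * pstdev(chunk)  # population standard deviation
        lower.append(mid - spread)
        middle.append(mid)
        upper.append(mid + spread)
    return lower, middle, upper


closes = [10.0, 11.0, 12.0, 11.0, 10.0, 12.0, 14.0]  # made-up closing prices
print(sma(closes, 3))
print(bollinger_bands(closes, window=3))
```

The dedicated libraries compute the same kind of rolling statistics, but much faster and with many more indicators.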
Backtesting (Testing Strategies on Past Data)
- Backtrader: A powerful framework for testing trading strategies. It allows you to simulate trades, handle commissions, and visualize results on a chart.
- Zipline Reloaded: A maintained fork of Zipline, the engine that originally powered Quantopian. It is an event-driven backtester that models trade execution realistically.
- VectorBT: A high-performance library that uses NumPy and Numba to backtest millions of strategies at once using "vectorization."
5. Healthcare & Medical Research
These libraries are specialized for medical data standards, bioinformatics, and clinical diagnostics.
Medical Imaging
- Pydicom: The primary library for handling DICOM files (the standard format for X-rays, MRIs, and CT scans). It allows you to read, modify, and write medical image metadata.
- SimpleITK / ITK: Used for image segmentation and registration. It’s essential for tasks like identifying a tumor in an organ or overlaying two different scans.
- NiBabel: Specifically designed for neuroimaging, it helps in reading and writing common formats like NIfTI used in brain research.
Bioinformatics & Genomics
- Biopython: The gold standard for biological computation. It provides tools for sequence analysis (DNA/RNA), protein structure modeling, and accessing online biological databases like NCBI.
- Scanpy: A specialized library for analyzing single-cell gene expression data, used heavily in modern cancer and immunology research.
- Pysam: A wrapper for SAMtools, used to manipulate high-throughput sequencing data (BAM/SAM/VCF files).
Interoperability & Health Data Standards
- FHIRCraft / fhir.resources: These libraries help developers work with FHIR (Fast Healthcare Interoperability Resources), the modern global standard for exchanging electronic health records.
- python-hl7: A simple library used to parse and send HL7 v2.x messages, the legacy messaging standard still used by most hospital machines and lab systems.
Clinical AI & EHR Analysis
- PyHealth: A comprehensive deep learning library designed specifically for healthcare AI. It helps researchers build predictive models for tasks like mortality prediction or medication recommendation using clinical datasets (like MIMIC-III).
- Ehrapy: Built for exploratory analysis of Electronic Health Records. It treats EHR data similarly to genomic data, allowing for advanced clustering and survival analysis.
- Lifelines: The go-to library for Survival Analysis, used in clinical trials to estimate the time until a specific event (like patient recovery or relapse) occurs.
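To show what survival analysis actually estimates, below is a hand-written sketch of the Kaplan-Meier estimator in plain Python. It is not the Lifelines API (Lifelines offers this estimator as KaplanMeierFitter, along with confidence intervals and plotting), and the patient data is made up for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time an event occurred.

    durations: follow-up time for each subject.
    observed:  True if the event happened at that time, False if censored
               (the subject left the study before the event was seen).
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Subjects whose event happened exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Made-up data: months of follow-up and whether a relapse was observed.
months = [5, 6, 6, 8, 10, 12]
relapsed = [True, False, True, True, False, True]
for time, prob in kaplan_meier(months, relapsed):
    print(f"month {time}: estimated probability of no relapse = {prob:.3f}")
```

Censored subjects still count toward the at-risk group until they drop out, which is what separates this estimate from a simple fraction of events.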